Abstract. Note. The results of this manuscript have been merged and published
with another paper by the same authors: A new approach to nonrepetitive
sequences.

A repetition of size $h$ ($h \geq 1$) in a given sequence is a subsequence of consecutive
terms of the form $xx = x_1 \ldots x_h x_1 \ldots x_h$. A sequence is nonrepetitive
if it does not contain a repetition of any size. The remarkable construction 
of Thue asserts that 3 different symbols are enough to build an arbitrarily 
long nonrepetitive sequence. We consider game-theoretic versions of results on 
nonrepetitive sequences. A nonrepetitive game is played by two players who 
pick, one by one, consecutive terms of a sequence over a given set of symbols. 
The first player tries to avoid repetitions, while the second player, in contrast, 
wants to create them. Of course, by simple imitation, the second player can 
force lots of repetitions of size 1. However, as proved by Pegden [5], there is a 
strategy for the first player to build an arbitrarily long sequence over 37 symbols
with no repetitions of size > 1. Our techniques allow us to reduce 37 to 6.
Another game we consider is an erase-repetition game. Here, whenever a rep- 
etition occurs, the repeated block is immediately erased and the next player 
to move continues the play. We prove that there is a strategy for the first 
player to build an arbitrarily long nonrepetitive sequence over 8 symbols. Our 
approach is inspired by a new algorithmic proof of the Lovász Local Lemma
due to Moser and Tardos [4] and previous work of Moser (his so-called entropy
compression argument).



1. Introduction 

A repetition of size $h$ ($h \geq 1$) in a sequence is a subsequence of consecutive terms
of the form $xx = x_1 \ldots x_h x_1 \ldots x_h$. A sequence is nonrepetitive if it does not
contain a repetition of any size. 

In 1906 Thue [6] (see also [1]) proved that there exist arbitrarily long nonrepet- 
itive sequences over only 3 different symbols. The method discovered by Thue is 
constructive and uses substitutions over a given set of symbols. Recently [3] a 
completely different approach to creating long nonrepetitive sequences emerged. 
Consider the following naive procedure: generate consecutive terms of a sequence 
by choosing symbols at random (uniformly and independently) and every time a 
repetition occurs, erase the repeated block and continue. For instance, if the generated
sequence is abcbc, we must cancel the last two symbols, which brings us
back to abc. By a simple counting argument one can prove that with positive probability the
length of a constructed sequence exceeds any finite bound, provided the number 
of symbols is at least 4. This is slightly weaker than Thue's result, but the argu- 
ment seems to be more flexible for adaptations to other settings. It is proved in 
[3] that for every $n$ and every sequence of sets $L_1, \ldots, L_n$, each of size 4, there
is a nonrepetitive sequence $s_1, \ldots, s_n$ where $s_i \in L_i$. The analogous statement for



Date: January 15, 2013. 

Key words and phrases. Thue, nonrepetitive sequence.

Research of the authors was supported by the Polish Ministry of Science and Higher Education
grants: N206 2570 35, N206 2728 33, N206 3761 37.






J. GRYTCZUK, J. KOZIK, AND P. MICEK 



lists of size 3 remains an exciting open problem. In this paper we apply the
above-mentioned approach to games involving nonrepetitive sequences.
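
The naive procedure recalled above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the construction of [3]: the function names are hypothetical, the step cap `max_steps` is added here so that the loop always terminates, and when several repetitions end at the last position the sketch erases the shortest one (the text does not fix a tie-breaking rule).

```python
import random

def shortest_suffix_repetition(seq):
    """Return the size h of the shortest repetition ending at the last term, or 0."""
    for h in range(1, len(seq) // 2 + 1):
        if seq[-h:] == seq[-2 * h:-h]:
            return h
    return 0

def append_and_erase(seq, symbol):
    """Append one symbol; if a repetition appears, erase the repeated block."""
    seq = seq + [symbol]
    h = shortest_suffix_repetition(seq)
    return seq[:-h] if h else seq

def is_nonrepetitive(seq):
    """Check that no block of consecutive terms has the form xx."""
    # Every repetition ends at some position, so it is a suffix of a prefix.
    return all(shortest_suffix_repetition(seq[:i]) == 0
               for i in range(1, len(seq) + 1))

def random_nonrepetitive(n, symbols="abcd", max_steps=100_000, rng=random):
    """Run the naive procedure until length n is reached or max_steps is used up."""
    seq = []
    for _ in range(max_steps):
        if len(seq) >= n:
            break
        seq = append_and_erase(seq, rng.choice(symbols))
    return seq

if __name__ == "__main__":
    # The example from the text: abcb followed by c collapses back to abc.
    print("".join(append_and_erase(list("abcb"), "c")))
    print("".join(random_nonrepetitive(30)))
```

The invariant that matters is that the current sequence is nonrepetitive before every step, so a new repetition can only appear as a suffix; this is why checking suffixes is enough.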

The nonrepetitive game over a symbol set S is played by two players in the 
following way. The players collectively build a sequence choosing from S, one by 
one, consecutive terms of the sequence. The first player, Ann, is trying to avoid 
repetitions, while the second player, Ben, does not necessarily cooperate. Of course,
just by mimicking Ann's moves Ben can force a lot of repetitions of size 1. It turns 
out however that for large enough S he cannot force any larger repetition at all! 
Pegden [5], using his extension of the Lovász Local Lemma, proved that Ann has
a strategy in the nonrepetitive game to build an arbitrarily long sequence with no
repetition of size greater than 1 over a symbol set of size at least 37 (no matter how
perfidiously Ben is playing). In this paper we prove (Theorem 2) that Ann can do 
the same on every set of symbols of size at least 6. On the other hand, Ben can 
easily force nontrivial repetitions in a game on just 3 symbols. Thus, the minimum 
size of a set of symbols required to ensure Ann's strategy is 4, 5 or 6. 

The erase-repetition game over a set of symbols S is also a two-player game 
between, say, Ann and Ben. As before, they build a sequence by picking symbols alternately
from S and appending them to the end of the sequence built so far. But this
time whenever a repetition occurs the repeated block is immediately erased and 
the next player continues extending the remaining prefix of the sequence. We prove 
(Theorem 1) that there is a strategy for Ann in this game to build an arbitrarily 
long nonrepetitive sequence over at least 8 symbols. 

The proof of the bound for the erase-repetition game is simpler, therefore it is 
presented first. 

2. Preliminaries 

We make some use of the theory of generating functions. We consider only algebraic
functions. A generating function $t(z) = \sum_n T_n z^n$ with positive radius of convergence
is algebraic if there exists a nonconstant polynomial $P(z, t) \in \mathbb{C}[z, t]$ (the defining
polynomial) such that $P(z, t(z))$ is identically zero within the disc of convergence
of $t(z)$. It is a well-known fact that if the radius of convergence of $\sum_n T_n z^n$ is
strictly greater than $a$, then $T_n = o(a^{-n})$. The following observation is fundamental
in the analysis of algebraic generating functions, a thorough study of which can
be found in [2] (Chapter VII.7).

Observation. Let $t(z) = \sum_n T_n z^n$ be a nonpolynomial algebraic generating function
with defining polynomial $P(z,t)$. Then the radius of convergence of $t(z)$ is one
of the roots of the discriminant of $P(z,t)$ with respect to the variable $t$ (i.e., the resultant
of $P(z,t)$ and $\partial_t P(z,t)$ with respect to $t$).

The coefficients of the functions we use are nonnegative integers. In such cases it
is known that the radius of convergence is not greater than 1 whenever a function
has infinitely many nonzero coefficients. In order to bound the growth of the
sequence of coefficients of such a function, we calculate the discriminant of its defining
polynomial $P(z,t)$ with respect to the variable $t$, and look for its positive real roots in
the interval $(0, 1]$. If there is only one such root, it must be the radius of convergence
of the function.
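
As an illustration of this recipe on a standard textbook case (it is not used in the proofs below), consider the algebraic function defined by $t = z + t^2$:

```latex
\begin{align*}
t(z) &= z + t(z)^2, &
P(z,t) &= t^2 - t + z, \\
\operatorname{disc}_t P(z,t) &= 1 - 4z, &
t(z) &= \frac{1 - \sqrt{1 - 4z}}{2}
      = \sum_{n \geq 1} \frac{1}{n}\binom{2n-2}{n-1} z^n .
\end{align*}
```

The discriminant has the single root $z = 1/4$ in $(0, 1]$, so the radius of convergence is $1/4$ and the coefficients satisfy $T_n = o(a^{-n})$ for every $a < 1/4$.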

3. The erase-repetition game 

Theorem 1. In the erase-repetition game over a symbol set of size 8, there exists 
a strategy for Ann to build an arbitrarily long nonrepetitive sequence. 

Proof. We fix n and prove that Ann has a strategy to build a nonrepetitive sequence 
of size n. In fact, the strategy for Ann will be randomized and we will show that 
for every strategy of Ben there is an evaluation of random experiments leading to
a sequence of size $n$ against that strategy. The fact that for every strategy of
Ben there is a strategy for Ann to build a sequence of size $n$ implies that Ann has
a strategy to build such a sequence in general. Then, by a routine application of
König's Infinity Lemma, we get the thesis.

Let C be the size of a symbol set. The argument to be presented turns out to 
work for $C \geq 8$. The strategy for Ann is the following: choose a random element
distinct from the last three symbols in the sequence constructed so far. In this
setting, Ann does not generate repetitions of size 1, 2 and 3. Obviously, Ben can
cause many repetitions of size 1 but repetitions of size 2 and 3 are not possible.
Indeed, in order to get a repetition of the form 'abcabc' the last three symbols
must be generated by Ben. Consider Ann's move just before Ben puts 'b' in the
repeated block. As she could not play the preceding symbol 'a', she must have invoked
a repetition. But all her repetitions are of size at least 4 and therefore the repeated
block must have ended with 'abca'. This would mean that she played 'a', which
is not possible as this symbol is not distinct from the last three in the current
sequence at that step. An analogous argument proves that repetitions of size 2 are
also impossible.
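
The following Python sketch plays the erase-repetition game with Ann's strategy and records the size of every erased repetition, so the claim above can be observed on sample plays. It is a hedged illustration only: `mimicking_ben` is a hypothetical opponent invented here, the function names are not from the paper, and the sketch erases the shortest repetition ending at the last symbol when several are present.

```python
import random

def shortest_suffix_repetition(seq):
    """Return the size h of the shortest repetition ending at the last term, or 0."""
    for h in range(1, len(seq) // 2 + 1):
        if seq[-h:] == seq[-2 * h:-h]:
            return h
    return 0

def ann_move(seq, symbols, rng):
    """Ann's strategy: a random symbol distinct from the last three symbols."""
    return rng.choice([s for s in symbols if s not in seq[-3:]])

def mimicking_ben(seq, symbols, rng):
    """Hypothetical opponent: repeat the last symbol half of the time, else play at random."""
    if seq and rng.random() < 0.5:
        return seq[-1]
    return rng.choice(symbols)

def play_erase_game(moves, symbols, ben, rng):
    """Play `moves` moves; return the final sequence and a list of (player, erased size)."""
    seq, erasures = [], []
    for step in range(moves):
        player = "Ann" if step % 2 == 0 else "Ben"   # Ann moves first
        if player == "Ann":
            symbol = ann_move(seq, symbols, rng)
        else:
            symbol = ben(seq, symbols, rng)
        seq.append(symbol)
        h = shortest_suffix_repetition(seq)
        if h:
            del seq[-h:]                              # erase the repeated block
            erasures.append((player, h))
    return seq, erasures

if __name__ == "__main__":
    seq, erasures = play_erase_game(2000, "abcdefgh", mimicking_ben, random.Random(1))
    print(len(seq), sorted({h for _, h in erasures}))
```

In such runs every erasure caused by Ann has size at least 4 and every erasure caused by Ben has size 1 or at least 4, in line with the argument in the text.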

Fix $n$ and a strategy of Ben. Take $M$ sufficiently large and consider possible
scenarios of the first $2M$ moves of the game against that fixed strategy of Ben.
Suppose, for a contradiction, that the size of the sequence after $2M$ moves is always
(for any evaluation of Ann's choices) less than $n$. Ann generates exactly $M$ elements.
Let $r_j$ ($1 \leq j \leq M$) be the $j$th symbol generated by Ann. Clearly, $r_1, \ldots, r_M$ is a
sequence of random variables with at least $(C - 3)^M$ possible evaluations. When
we fix an evaluation of $r_1, \ldots, r_M$ the course of the whole game is determined.

Let $h_j$ ($1 \leq j \leq 2M$) be the length of the sequence generated after $j$ moves
(including possible erasure invoked by the $j$th move) and let $d_1, \ldots, d_{2M}$ be the
sequence of differences: $d_1 = 1$, $d_j = h_j - h_{j-1}$ for $2 \leq j \leq 2M$. Note that $d_j = 1$
means that there is no erasure after the $j$th move and $d_j < 1$ indicates that a repeated
block of size $|d_j| + 1$ was removed. A pair $(D, S)$ is a game log if there is an
evaluation of $r_1, \ldots, r_M$ such that $D$ is the sequence of differences and $S$ is the
final sequence produced after $2M$ moves. A pair $(D, S)$ is a reduced game log if
it is a log but with all zeros in $D$ erased. Note that any sequence of differences
$D = (d_1, \ldots, d_m)$ in a reduced log satisfies:

(i) $m \leq 2M$,

(ii) $d_j \in \{1, -3, -4, -5, \ldots\}$ for all $1 \leq j \leq m$,

(iii) $\sum_{j=1}^{k} d_j \geq 1$ for all $1 \leq k \leq m$.

Claim. Every reduced log corresponds to a unique evaluation of $r_1, \ldots, r_M$.

Proof. Given a reduced log $((d_1, \ldots, d_m), S_m)$ with $S_m = (s_1, \ldots, s_l)$, we decode
all random choices taken by Ann in two steps. First we reconstruct the sequence
$x_1, \ldots, x_m$ of all symbols introduced in the game except those (of Ben) generating
repetitions of size 1. The introduced symbols generating repetitions of size 1 are
called bad, other symbols are good. A move is good (bad) if a good (bad) symbol
is introduced. The number of good moves played is exactly the size of the difference
sequence in the reduced log, namely $m$. Note that $S_m$ is the sequence formed after
the $m$th good move (even if a bad move is played afterwards, it does not change
the sequence).

We reconstruct the sequence of good symbols backwards, i.e., we first decode
$x_m$, which is the last good symbol introduced, and the sequence $S_{m-1}$ constructed
after $m - 1$ good moves. Then, by simple iteration, we extract all the remaining
good symbols $x_{m-1}, \ldots, x_1$.

If $d_m = 1$, then the $m$th good symbol introduced did not invoke a repetition.
Thus, the last good symbol introduced is the last symbol of the final sequence, i.e.,
$x_m = s_l$ and $S_{m-1} = (s_1, \ldots, s_{l-1})$.

If $d_m \leq 0$, then some symbols were erased after the $m$th good move. But since
we know the size of the repetition, namely $h = |d_m| + 1$, and only one half of it
was erased, we can read and copy the first part of the repeated block to restore
$S_{m-1} = (s_1, \ldots, s_l, s_{l-h+1}, \ldots, s_{l-1})$ and $x_m = s_l$.

Once we get all $x_1, \ldots, x_m$, we read the sequence from the beginning and check
whether the symbols agree with the strategy of Ben we fixed. The difference appears
only where Ben introduces a bad symbol. There we extend the sequence with
this symbol and continue. This way we reconstruct the sequence of all symbols
introduced in the game and clearly every second symbol is chosen by Ann. □
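
The backward decoding step can be tried out on a simplified setting. The Python sketch below is an assumption-laden illustration, not the proof itself: it uses a one-player variant (every symbol is a recorded choice, there is no Ben and no notion of bad moves, and zeros are kept in the difference sequence), and the names `play` and `decode` are hypothetical. It shows that the pair (differences, final sequence) determines all choices.

```python
import random

def shortest_suffix_repetition(seq):
    """Return the size h of the shortest repetition ending at the last term, or 0."""
    for h in range(1, len(seq) // 2 + 1):
        if seq[-h:] == seq[-2 * h:-h]:
            return h
    return 0

def play(choices):
    """Run the erase procedure on the given choices; return (differences, final sequence)."""
    seq, diffs = [], []
    for x in choices:
        before = len(seq)
        seq.append(x)
        h = shortest_suffix_repetition(seq)
        if h:
            del seq[-h:]                  # erase the repeated block of size h
        diffs.append(len(seq) - before)   # equals 1 - h, or 1 if nothing was erased
    return diffs, seq

def decode(diffs, final):
    """Recover all choices from a log by reading the differences backwards."""
    seq, choices = list(final), []
    for d in reversed(diffs):
        choices.append(seq[-1])           # x_m is always the last symbol of S_m
        if d == 1:
            seq.pop()                     # no erasure: S_{m-1} is S_m without x_m
        else:
            h = 1 - d                     # size of the erased repetition
            seq.extend(seq[-h:-1])        # copy back the first h - 1 erased symbols
    choices.reverse()
    return choices

if __name__ == "__main__":
    rng = random.Random(0)
    choices = [rng.choice("abcd") for _ in range(20)]
    diffs, final = play(choices)
    print(diffs, "".join(final), decode(diffs, final) == choices)
```

The `else` branch mirrors the formula for $S_{m-1}$ in the proof: the surviving half of the repetition is read from the end of the current sequence and its first $h - 1$ symbols are appended again.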

By a game walk we mean a sequence $d_1, \ldots, d_m$ satisfying (ii), (iii) and additionally
$\sum_{j=1}^{m} d_j = 1$. Let $T_m$ be the number of game walks of length $m$. By our
assumption that Ann never wins, every feasible sequence of differences in a reduced
log sums up to a number smaller than $n$. The number of sequences satisfying (ii),
(iii) but with a total sum $k$ (for fixed $k \geq 1$) is $O(T_m)$. Finally, all feasible sequences
of differences are of size at most $2M$. All this yields that the number of feasible
difference sequences in a reduced log is at most $2M \cdot n \cdot O(T_{2M})$. For a given feasible
sequence of differences $D$, the number of final sequences which can occur with $D$ in
a reduced log is bounded by $C^n$. Thus, the number of reduced logs is bounded by

$2M \cdot n \cdot O(T_{2M}) \cdot C^n$.

We turn to the approximation of $T_{2M}$. Every game walk $d_1, \ldots, d_m$ is either a
single step up (i.e., $m = 1$, $d_1 = 1$), or it can be uniquely decomposed into $|d_m| + 1$
subsequent game walks of total length $m - 1$. The $j$th component of the decomposition
is the substring between the last visit of height $j - 1$ and the last visit of height
$j$ (i.e., between the last $k$ such that $\sum_{i=1}^{k} d_i = j - 1$ and the last $l$ such that $\sum_{i=1}^{l} d_i = j$).
This description, together with the fact that if $d_m < 1$, then $|d_m| + 1 \geq 4$, certifies
that the generating function $t(z) = \sum_{n \in \mathbb{N}} T_n z^n$ satisfies the following functional
equation:

$t(z) = z + z(t(z)^4 + t(z)^5 + \ldots),$

where the right-hand side is $z + z \frac{t(z)^4}{1 - t(z)}$. From this equation we extract a polynomial

$P(z, t) = z t^4 + t^2 - (1 + z)t + z$

that defines t{z). In the standard way we calculate the discriminant polynomial 
obtaining: 

$-4 - 19z + 32z^2 - 2z^3 + 36z^4 + 229z^5.$

This polynomial has only one positive real root, equal to 0.457..., which is greater
than $5^{-1/2}$. Therefore $T_{2M} = o(5^M)$.
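
The counting above can be cross-checked numerically. The Python sketch below is an illustration under stated assumptions (all function names are hypothetical): it counts game walks directly by dynamic programming over heights, compares the counts with the coefficients obtained by iterating the functional equation on truncated power series, and locates the positive root of the discriminant polynomial by bisection.

```python
def count_game_walks(max_len):
    """T[m] = number of walks d_1..d_m with steps in {1, -3, -4, ...},
    all partial sums >= 1 and total sum 1."""
    T = [0] * (max_len + 1)
    ways = {0: 1}                                    # height -> number of partial walks
    for m in range(1, max_len + 1):
        nxt = {}
        for height, count in ways.items():
            nxt[height + 1] = nxt.get(height + 1, 0) + count    # step +1
            for target in range(1, height - 2):                 # steps -3, -4, ...
                nxt[target] = nxt.get(target, 0) + count
        ways = nxt
        T[m] = ways.get(1, 0)
    return T

def series_coefficients(max_len):
    """Coefficients of t(z) solving t = z + z (t^4 + t^5 + ...), up to degree max_len."""
    def mul(a, b):
        out = [0] * (max_len + 1)
        for i, x in enumerate(a):
            if x:
                for j, y in enumerate(b):
                    if i + j > max_len:
                        break
                    out[i + j] += x * y
        return out

    t = [0] * (max_len + 1)
    for _ in range(max_len):                         # each pass fixes one more coefficient
        power = mul(mul(t, t), mul(t, t))            # t^4
        total = [0] * (max_len + 1)
        for _k in range(4, max_len + 1):             # t^4 + t^5 + ... + t^max_len
            total = [u + v for u, v in zip(total, power)]
            power = mul(power, t)
        new_t = [0] * (max_len + 1)
        new_t[1] = 1                                 # the term z
        for i in range(max_len):
            new_t[i + 1] += total[i]                 # multiply the sum by z
        t = new_t
    return t

def discriminant(z):
    return -4 - 19 * z + 32 * z**2 - 2 * z**3 + 36 * z**4 + 229 * z**5

def positive_root(f, lo=0.0, hi=1.0, iterations=60):
    """Bisection, assuming f(lo) < 0 < f(hi) with a single sign change in between."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    print(count_game_walks(12)[1:])
    print(positive_root(discriminant), 5 ** -0.5)
```

Bisection is adequate here because the polynomial is convex for $z \geq 0$ (its second derivative is positive there), negative at 0 and positive at 1, so it has exactly one positive root.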

By the claim the number of realizations is exactly the number of reduced logs. 
That gives 

$(C - 3)^M \leq 2M \cdot n \cdot O(T_{2M}) \cdot C^n = o(5^M).$
Thus for $C \geq 8$ and sufficiently large $M$ we obtain a contradiction. □

4. The nonrepetitive game 

Theorem 2. In the nonrepetitive game over a symbol set of size 6, there is a 
strategy for Ann to build an arbitrarily long sequence with no repetitions of size 
greater than 1. 
Proof. We fix $n$ and prove that Ann has a strategy to build a sequence of size
$n$ without repetitions of size greater than 1. As before, we consider a randomized
strategy for Ann and we show that for every strategy of Ben there is an evaluation of
random experiments leading to the generation of a nonrepetitive sequence of size
$n$. This means that Ben cannot have a winning strategy. Therefore, there exists a
winning strategy for Ann. Then, again, by a routine application of König's lemma
we get the thesis.

In this proof, by a repetition we mean a repetition of size greater than 1. 

Let $s_1, \ldots, s_{m-1}$ be the sequence already generated in the game and suppose
that it is Ann's turn ($m$ is odd). The strategy for Ann goes as follows: choose any
symbol at random, but

(i) exclude $s_{m-2}$,

(ii) if $s_{m-1} = s_{m-4}$, then exclude $s_{m-3}$,

(iii) if only one symbol has been excluded in (i) and (ii), then exclude $s_{m-4}$.

This strategy explicitly ensures that no repetitions of size 2 and 3 occur in the
game. It turns out that repetitions of size 4 are also avoided. Suppose for a contradiction
that at some point in the game a sequence with a suffix of the form
$x_1 x_2 x_3 x_4 x_1 x_2 x_3 x_4$ is produced. Suppose also that Ann introduces the last symbol,
namely $x_4$. As she did not prevent a repetition of size 4, rule (iii) of the strategy
did not exclude a symbol and therefore rule (ii) must have been invoked. In
particular, $x_3 = x_4$. But this means that in the previous move of Ann (when she
introduced $x_2$ in the repeated block) the symbols excluded by (i) and (ii) were the
same, so rule (iii) must have been applied. But that rule excludes $x_2$, a contradiction.
Analogous reasoning works for the case when Ben finishes a repetition of size
4.
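
Rules (i)-(iii) translate directly into code. The Python sketch below is a hedged illustration: it plays the plain game (no backtracking) with Ann following the rules and a hypothetical Ben who plays uniformly at random, and it collects every repetition of size 2, 3 or 4 that appears at the end of the sequence. The function names are assumptions made for this example.

```python
import random

def ann_allowed(seq, symbols):
    """Symbols Ann may play after s_1..s_{m-1} = seq under rules (i)-(iii)."""
    def back(k):                       # s_{m-k}, or None if it does not exist yet
        return seq[-k] if len(seq) >= k else None

    excluded = set()
    if back(2) is not None:
        excluded.add(back(2))                          # rule (i): exclude s_{m-2}
    if back(4) is not None and back(1) == back(4):
        excluded.add(back(3))                          # rule (ii): exclude s_{m-3}
    if len(excluded) == 1 and back(4) is not None:
        excluded.add(back(4))                          # rule (iii): exclude s_{m-4}
    return [s for s in symbols if s not in excluded]

def has_suffix_repetition(seq, h):
    """True if the last 2h symbols form a repetition of size h."""
    return len(seq) >= 2 * h and seq[-h:] == seq[-2 * h:-h]

def play_nonrepetitive_game(moves, symbols, rng):
    """Ann follows rules (i)-(iii); Ben plays at random.
    Returns the sequence and the sizes of all repetitions of size 2, 3, 4 seen."""
    seq, small = [], []
    for step in range(moves):
        if step % 2 == 0:                              # Ann plays the odd-numbered moves
            symbol = rng.choice(ann_allowed(seq, symbols))
        else:
            symbol = rng.choice(symbols)
        seq.append(symbol)
        small.extend(h for h in (2, 3, 4) if has_suffix_repetition(seq, h))
    return seq, small

if __name__ == "__main__":
    seq, small = play_nonrepetitive_game(2000, "abcdef", random.Random(0))
    print(len(seq), small)
```

Since at most two symbols are ever excluded, Ann always has at least $C - 2$ symbols to choose from, which is the count used later in the proof.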

Fix a strategy for Ben. We simulate the play between randomized Ann and this
fixed strategy, and whenever a repetition of size $h$ occurs in the $m$th move (of the
real game), we backtrack to the move $m - h + 1$. This means that we remove
the whole repeated segment and continue the simulation starting from the move
$m - h + 1$ again (with independent random experiments).

A search sequence is the sequence of consecutive symbols chosen by players in
the simulation. Note that it is not possible for Ben to introduce three symbols in
a row in the search sequence. Indeed, if he introduces two symbols in a row, then
there must have been a repetition (of odd size) after the first symbol. Thus, the
second one is the same as the symbol just erased at this position (as Ben's strategy
is fixed in the simulation). This means that the second symbol could not generate
a repetition and therefore Ann is next to play in the simulation.

The weight of a search sequence is the number of symbols chosen by Ann in the 
sequence. Fix M large enough. We are going to show that there is a scenario of 
the first M random experiments (first M moves of Ann) leading the simulation to 
an outcome sequence of size n. This will prove that Ann has a strategy to build a 
sequence of size n against the fixed strategy of Ben. For a contradiction we suppose 
that all outcome sequences generated after M moves of Ann in the simulation are 
of length less than n for all possible evaluations of random experiments. 

Clearly, a search sequence of weight $M$ is uniquely determined by the sequence
of Ann's $M$ choices. Let $r_1, \ldots, r_M$ be the symbols chosen by Ann. As she always
chooses one symbol out of at least $C - 2$ symbols, the sequence $r_1, \ldots, r_M$ has at
least $(C - 2)^M$ possible evaluations. A search sequence induced by an evaluation
of $r_1, \ldots, r_M$ is called a realization of this evaluation.

Let $h_j$ be the length of the current sequence just before the $j$th step (move) of
the simulation. The sequence $(h_j)$ is called a height sequence. If Ann introduces a
symbol in the $j$th step, then her next move is in step $k \in \{j + 1, j + 2, j + 3\}$. There
are only a few possible extensions of the height sequence from $h_j$ up to $h_k$:

(0) Ann makes no repetition in the $j$th step and Ben makes no repetition in the
$(j + 1)$th step. In this case $k = j + 2$ and $h_{j+1} = h_j + 1$, $h_k = h_j + 2$.

(1) Ann makes a repetition of odd size, at least 5, in the $j$th step and therefore
she plays again in the $(j + 1)$th step. Here $k = j + 1$ and $h_k \leq h_j - 4$.

(2) Ann makes a repetition of even size, at least 6, in the $j$th step and Ben plays
no repetition in the $(j + 1)$th step. Here $k = j + 2$ and $h_k \leq h_j - 4$.

(3) Ann makes no repetition in the $j$th step and Ben produces a repetition of even
size, at least 6, in the $(j + 1)$th step. Here again $k = j + 2$ and $h_k \leq h_j - 4$.

(4) Ann makes no repetition in the $j$th step. Ben makes a repetition of odd size,
at least 5, in the $(j + 1)$th step. Then he plays no repetition in the $(j + 2)$th
step. Here $k = j + 3$ and $h_k \leq h_j - 2$.

We want to get rid of some redundancy in the height sequence. More precisely, we
encode the sequence of heights into its subsequence consisting of the $h_j$'s corresponding
to Ann's moves with a little extra information. Let $h_j$, $h_k$ be again the heights of
the current sequence right before any two consecutive moves of Ann. Note that

* If $h_k > h_j$, then the sequence of heights between $h_j$ and $h_k$ is of type (0).

* If $h_k = h_j - 2$, then the sequence of heights between $h_j$ and $h_k$ is of type (4).

* If $h_k \leq h_j - 4$, then the sequence of heights between $h_j$ and $h_k$ is of type (1),
(2), (3) or (4).

Therefore, in order to record the whole height sequence it is enough to remember the
subsequence $h'_1, \ldots, h'_M$ of heights corresponding to Ann's moves and additionally,
if $h'_{j+1} \leq h'_j - 4$, to record $\mathrm{type}(h'_j, h'_{j+1}) \in \{1, 2, 3, 4\}$, which is the type of the
original height sequence between symbols corresponding to $h'_j$ and $h'_{j+1}$.

Finally, note that all the $h'_j$'s are even (as the current sequence before Ann's
move contains an even number of symbols). The reduced sequence of differences is:
$d_1 = 1$, $d_{j+1} = (h'_{j+1} - h'_j)/2$ for $1 \leq j < M$, and the type function is $\mathrm{type}(d_{j+1}) =
\mathrm{type}(h'_j, h'_{j+1})$, provided the latter is defined. Note that

(i) $d_j \leq 1$,

(ii) $\sum_{j=1}^{k} d_j \geq 1$ for all $1 \leq k \leq M$,

(iii) $\mathrm{type}(d_j)$ is defined if and only if $d_j \leq -2$.

A pair $((D, \mathrm{type}), S)$ is a search log if there is an evaluation of $r_1, \ldots, r_M$ such
that $D$ is the reduced sequence of differences in the realization of $r_1, \ldots, r_M$, type
is the type function of $D$, and $S$ is the final sequence produced after $M$ steps of
Ann in this realization of the search procedure.

Claim. Every search log corresponds to a unique evaluation of $r_1, \ldots, r_M$.

Proof. Given a search log $((D, \mathrm{type}_D), S)$ where $S = (s_1, \ldots, s_l)$ we decode the evaluation
of $r_1, \ldots, r_M$ in a few steps. First we extract the height sequence $h_1, \ldots, h_m$
from $(D, \mathrm{type}_D)$ and put additionally $h_{m+1} = |S|$. Now, we are going to describe
how to reconstruct the sequence $x_1, \ldots, x_m$ of all symbols introduced in the simulation.
This is done in backward direction, i.e., we decode first $x_m$ and the sequence
$S_{m-1}$ constructed after $m - 1$ steps of the simulation. Then by simple iteration we
extract all the remaining symbols $x_{m-1}, \ldots, x_1$.

If $h_{m+1} - h_m = 1$, then the introduction of $x_m$ did not invoke a repetition. Thus,
$x_m$ is the last symbol in the final sequence $S$, i.e., $x_m = s_l$ and $S_{m-1} = (s_1, \ldots, s_{l-1})$.

If $h_{m+1} - h_m \leq 0$, then some symbols were erased after the introduction of
$x_m$. But we know the size of the repetition, namely $h = |h_{m+1} - h_m| + 1$, and
since only one half of it was erased, we can copy the appropriate block to restore
$S_{m-1} = (s_1, \ldots, s_l, s_{l-h+1}, \ldots, s_{l-1})$ and $x_m = s_l$.

Once we get all $x_1, \ldots, x_m$ we read the sequence from the beginning and track
the current sequence in the simulation. Every time the current sequence is of even
length the next symbol is introduced by Ann. □

By a typed search walk we mean a pair $((d_1, \ldots, d_M), \mathrm{type})$ satisfying (i), (ii),
(iii) and additionally $\sum_{j=1}^{M} d_j = 1$. Let $T_M$ be the number of typed search walks of length
$M$ (i.e., $D$ is of length $M$). By our assumption that Ann never wins, every feasible
sequence of differences in a typed search walk sums up to less than $n$. The number
of typed search walks of length $M$ satisfying (i), (ii), (iii) with total sum $k$ (fixed
$k \geq 1$) is $O(T_M)$. All this implies that the number of feasible typed search walks
is $n \cdot O(T_M)$. For a given feasible typed search walk $(D, \mathrm{type})$ the number of final
sequences which can occur with $(D, \mathrm{type})$ in a search log is bounded by $C^n$. Thus,
the number of search logs is bounded by

$n \cdot O(T_M) \cdot C^n$.

We turn to the approximation of $T_m$. Every search walk $((d_1, \ldots, d_m), \mathrm{type})$ is
either a single step up (i.e., $m = 1$, $d_1 = 1$), or it can be uniquely decomposed into
$|d_m| + 1$ subsequent search walks of total length $m - 1$ and additionally into the
type of $d_m$ if it is defined (i.e., if $d_m \leq -2$). This decomposition gives the following
functional equation for the generating function $t(z)$:

$t(z) = z + z t^2(z) + 4z(t^3(z) + t^4(z) + t^5(z) + \ldots),$

where $z$ stands for a trivial one-step-up walk, $z t^2(z)$ stands for the case $d_m = -1$
in which $d_m$ has no type, and the last term stands for the case $d_m \leq -2$. The right-hand
side of the equation is in fact equal to $z + z t^2(z) + 4z \frac{t^3(z)}{1 - t(z)}$. From that form
we derive the defining polynomial for $t(z)$:

$P(z, t) = -t + t^2 + z - tz + t^2 z + 3t^3 z.$

In the standard way we calculate the discriminant polynomial obtaining:

$-1 - 12z + 24z^2 + 80z^3 + 288z^4.$

The radius of convergence of $t(z)$ is one of the roots of the above polynomial. This
polynomial has only one positive real root, 0.2537.... The root is greater than 1/4,
therefore $T_M = o(4^M)$.
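
As a numerical sanity check of this bound, the Python sketch below counts typed search walks directly and compares the counts with $4^M$. It is an illustration under stated assumptions: the function names are hypothetical, and the step $d = 0$ is excluded because the functional equation above has no term for it.

```python
def count_typed_search_walks(max_len):
    """T[M] = number of typed walks of length M: steps +1 or -1 (one way each)
    or -2, -3, ... (four types each), all partial sums >= 1, total sum 1."""
    T = [0] * (max_len + 1)
    ways = {0: 1}                                     # height -> weighted count
    for m in range(1, max_len + 1):
        nxt = {}
        for height, count in ways.items():
            nxt[height + 1] = nxt.get(height + 1, 0) + count          # d = 1
            if height - 1 >= 1:
                nxt[height - 1] = nxt.get(height - 1, 0) + count      # d = -1, no type
            for target in range(1, height - 1):                       # d <= -2, types 1..4
                nxt[target] = nxt.get(target, 0) + 4 * count
        ways = nxt
        T[m] = ways.get(1, 0)
    return T

def discriminant(z):
    return -1 - 12 * z + 24 * z**2 + 80 * z**3 + 288 * z**4

def positive_root(f, lo=0.0, hi=1.0, iterations=60):
    """Bisection, assuming f(lo) < 0 < f(hi) with a single sign change in between."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    T = count_typed_search_walks(30)
    print(T[1:8])
    print(positive_root(discriminant))
    print(max(T[m] / 4**m for m in range(1, 31)))     # stays below 1
```

The first values, 1, 0, 1, 4, 6, agree with the expansion of the functional equation: the walk (1, 1, -1) contributes to $z t^2$, and (1, 1, 1, -2) with its four types contributes to $4 z t^3$.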

By the claim, the number of realizations is exactly the number of search logs. 
That gives 

$(C - 2)^M \leq n \cdot O(T_M) \cdot C^n = o(4^M).$
Therefore for $C \geq 6$ and sufficiently large $M$ we obtain a contradiction. □